Unsupervised Discovery of Generic Relationships Using Pattern Clusters and its Evaluation by Automatically Generated SAT Analogy Questions
نویسندگان
چکیده
We present a novel framework for the discovery and representation of general semantic relationships that hold between lexical items. We propose that each such relationship can be identified with a cluster of patterns that captures this relationship. We give a fully unsupervised algorithm for pattern cluster discovery, which searches, clusters and merges highfrequency words-based patterns around randomly selected hook words. Pattern clusters can be used to extract instances of the corresponding relationships. To assess the quality of discovered relationships, we use the pattern clusters to automatically generate SAT analogy questions. We also compare to a set of known relationships, achieving very good results in both methods. The evaluation (done in both English and Russian) substantiates the premise that our pattern clusters indeed reflect relationships perceived by humans.
منابع مشابه
Automated Translation of Semantic Relationships
We present a method for translating semantic relationships between languages where relationships are defined as pattern clusters. Given a pattern set which represents a semantic relationship, we use the web to extract sample term pairs of this relationship. We automatically translate the obtained term pairs using multilingual dictionaries and disambiguate the translated pairs using web counts. ...
متن کاملMining the Web for Reciprocal Relationships
In this paper we address the problem of identifying reciprocal relationships in English. In particular we introduce an algorithm that semi-automatically discovers patterns encoding reciprocity based on a set of simple but effective pronoun templates. Using a set of most frequently occurring patterns, we extract pairs of reciprocal pattern instances by searching the web. Then we apply two unsupe...
متن کاملSpatial dynamics for relative contribution of cropping pattern analysis on environment by integrating remote sensing and GIS
Agriculture resources reflected to be one of the most imperative renewable and dynamic natural resources. Agricultural sustainability has the premier priority in all countries, whether developed or developing. Cropping system analysis is indispensable for grinding the sustainability of agricultural science. Crop alternation is stated as growing one crop after another on the same piece of la...
متن کاملIdentifying user habits through data mining on call data records
In this paper we propose a framework for identifying patterns and regularities in the pseudoanonymized Call Data Records (CDR) pertaining a generic subscriber of a mobile operator. We face the challenging task of automatically deriving meaningful information from the available data, by using an unsupervised procedure of cluster analysis and without including in the model any apriori knowledge o...
متن کاملClassification of Semantic Relationships between Nominals Using Pattern Clusters
There are many possible different semantic relationships between nominals. Classification of such relationships is an important and difficult task (for example, the well known noun compound classification task is a special case of this problem). We propose a novel pattern clusters method for nominal relationship (NR) classification. Pattern clusters are discovered in a large corpus independentl...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2008